Chapter 5

Spectral Pattern Discovery

Analysing a spectrum to discover chemicals and molecules has long been
an important subject in biochemistry research. However, spectrometric
data are complex because a spectrum is often a mixture of a number of
signals and a complicated background. The latter is also called a
baseline. The signals mixed with the background of a spectrum can be
reliably discovered only after that background has been accurately
identified. Background estimation, or baseline removal, is thus the very
first step in spectral pattern discovery. The difficulty, however, is
that the baseline of a spectrum is rarely a linear function (a straight
line) or any other simple function that is easy to estimate. Instead, it
is commonly a complex, unknown and non-analytic function. Many
algorithms have therefore been developed for estimating the baseline of
a spectrum, in the hope of extracting peaks and thus discovering the
signals as accurately and as correctly as possible. Only when a baseline
has been accurately estimated and removed can the number of falsely
discovered signals be minimised and the number of true signals be
maximised. Among the many algorithms, the Whittaker-Henderson smoother
is one of the best for spectral pattern discovery. This chapter
introduces this algorithm and its variants, as well as other algorithms
used for spectral pattern discovery, and shows how these algorithms can
be applied to real spectrometric data. How to